MLP-ARD vs. Logistic Regression and C4.5 for PIP Claim Fraud Explication
Authors
Abstract
In this paper we demonstrate the explicative capabilities of multilayer perceptron (MLP) neural networks with automatic relevance determination (ARD) weight regularization for personal injury protection (PIP) automobile insurance claim fraud detection. The ARD objective function hyperparameter scheme provides a means of soft input selection, because it allows one to determine which predictor variables are most informative to the trained MLP. The MLP (hyper-)parameters are trained using MacKay’s (1992; 1994) evidence framework implementation of Bayesian learning for classification, on a data set of closed PIP insurance claims from accidents that occurred in Massachusetts during 1993. We adhere to an experimental strategy of using the aggregated decision from a ten-fold cross-validated ensemble for robust predictor importance assessment. The findings of MLP-ARD are then compared to the predictor importance evaluation from popular logistic regression and (smoothed and curtailed) C4.5 decision tree classifiers trained on the same data with an identical experimental setup. The results show good agreement between logistic regression and MLP-ARD, and somewhat less agreement between C4.5 and MLP-ARD.
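The abstract does not say how the ten fold-level importance assessments are combined, so the following is only a minimal sketch under one assumed rule: rank the predictors within each fold, then average the ranks across folds. The function names, the predictor names `x1` to `x3`, and the scores are all hypothetical. A per-fold score could be, for example, an inverse ARD hyperparameter or an absolute standardized logistic regression coefficient.

```python
from statistics import mean

def rank_scores(scores):
    """Rank predictors by score (1 = most important); ties share the average rank."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over the run of predictors tied with ordered[i].
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[ordered[k]] = avg
        i = j + 1
    return ranks

def aggregate_importance(fold_scores):
    """Average each predictor's rank over all folds; sort by mean rank (ascending)."""
    per_fold_ranks = [rank_scores(s) for s in fold_scores]
    names = per_fold_ranks[0].keys()
    mean_rank = {n: mean(r[n] for r in per_fold_ranks) for n in names}
    return sorted(mean_rank.items(), key=lambda item: item[1])

# Hypothetical importance scores from three folds (made-up numbers, not the paper's data).
folds = [
    {"x1": 0.9, "x2": 0.4, "x3": 0.1},
    {"x1": 0.7, "x2": 0.8, "x3": 0.2},
    {"x1": 0.6, "x2": 0.5, "x3": 0.3},
]
ranking = aggregate_importance(folds)
```

Averaging ranks rather than raw scores makes the rankings from MLP-ARD, logistic regression, and C4.5 comparable even though their importance measures live on different scales. This is a design choice of the sketch, not a claim about the paper's procedure.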
Similar resources
Cost-sensitive learning and decision making for Massachusetts PIP claim fraud data
In many real-life decision making situations the default assumption of equal (mis-)classification costs underlying pattern recognition techniques is most likely violated. Consider the case of insurance claim fraud detection for which an early claim screening facility is to be built to decide upon the nature of an incoming claim as either suspicious or not. This decision typically forms the basi...
A Comparison of State-of-the-art Classification Techniques for Expert Automobile Insurance Claim Fraud Detection
Several state-of-the-art binary classification techniques are experimentally evaluated in the context of expert automobile insurance claim fraud detection. The predictive power of logistic regression, C4.5 decision tree, k-nearest neighbor, Bayesian learning multilayer perceptron neural network, least-squares support vector machine, naive Bayes, and tree-augmented naive Bayes classification is ...
Comparison of Methods and Software for Modeling Nonlinear Dependencies: A Fraud Application
In recent years a number of approaches for modeling data containing nonlinear and other complex dependencies have appeared in the literature. These procedures include classification and regression trees, neural networks, regression splines and naïve Bayes. Viaene et al. (2002) compared several of these procedures, as well as a classical linear model, logistic regression, for prediction accuracy ...
Distinguishing the Forest from the TREES: A Comparison of Tree-Based Data Mining Methods
One of the most commonly used data mining techniques is decision trees, also referred to as classification and regression trees or C&RT. Several new decision tree methods are based on ensembles or networks of trees and carry names like TreeNet and Random Forest. Viaene et al. compared several data mining procedures, including tree methods and logistic regression, for modeling expert opinion of ...
Detecting Corporate Financial Fraud using Beneish M-Score Model
Detecting financial fraud is an important issue, and ignoring it may cause financial and non-financial losses to individuals and organizations. The aim of this study is to test the ability of the Beneish M-Score Model to detect financial fraud among companies listed on the Tehran Stock Exchange. The research sample consists of 137 companies listed on Tehran Stock Exchange for a period of 11 ...